Addressing Partitioned Arrays in Distributed Memory Multiprocessors – the Software Virtual Memory Approach
Abstract
Partitioning distributed arrays to ensure locality of reference is widely recognized as being critical in obtaining good performance on distributed memory multiprocessors. Data partitioning is the process of tiling data arrays and placing the tiles in memory such that a maximum number of data accesses are satisfied from local memory. Unfortunately, data partitioning makes it difficult to physically locate an element of a distributed array. Data tiles with complicated shapes, such as hyperparallelepipeds, exacerbate this addressing problem. In this paper we propose a simple scheme called software virtual memory that allows flexible addressing of partitioned arrays with low runtime overhead. Software virtual memory implements address translation in software using small, one-dimensional pages, and a compiler-generated software page map. Because page sizes are chosen by the compiler, arbitrarily complex data tiles can be used to maximize locality, and because the pages are one-dimensional, runtime address computations are simple and efficient. One-dimensional pages also ensure that software virtual memory is more efficient than simple blocking for rectangular data tiles. Software virtual memory provides good locality for complicated compile-time partitions, thus enabling the use of sophisticated partitioning schemes appearing in recent literature. Software virtual memory can also be used in systems that provide hardware support for virtual memory. Although hardware virtual memory, when used exclusively, eliminates runtime overhead for addressing, we demonstrate that it does not preserve locality of reference to the same extent as software virtual memory.
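The translation step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the page size, the `skewed_owner` partition, and the function names `build_page_map` and `translate` are all hypothetical, chosen only to show how a page map over one-dimensional pages turns a global index into a (processor, local address) pair with one divide, one table lookup, and one multiply-add.

```python
from typing import Callable, Dict, List, Tuple

PAGE_SIZE = 4  # elements per 1-D page (assumed here; the abstract says the compiler chooses it)


def skewed_owner(i: int, j: int) -> int:
    """Hypothetical non-rectangular partition over 2 processors:
    the owning processor shifts by one page per row, giving skewed tiles."""
    return (j // PAGE_SIZE + i) % 2


def build_page_map(n_rows: int, n_cols: int,
                   owner: Callable[[int, int], int]) -> List[Tuple[int, int]]:
    """Stand-in for the compiler-generated page map.
    Entry v holds (processor, local page frame) for virtual page v."""
    assert n_cols % PAGE_SIZE == 0, "sketch assumes pages never straddle rows"
    next_frame: Dict[int, int] = {}  # next free local frame on each processor
    page_map: List[Tuple[int, int]] = []
    for vpage in range(n_rows * n_cols // PAGE_SIZE):
        i, j = divmod(vpage * PAGE_SIZE, n_cols)  # first element of the page
        proc = owner(i, j)
        frame = next_frame.get(proc, 0)
        next_frame[proc] = frame + 1
        page_map.append((proc, frame))
    return page_map


def translate(i: int, j: int, n_cols: int,
              page_map: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Map global element (i, j) to (processor, local address)."""
    vpage, offset = divmod(i * n_cols + j, PAGE_SIZE)  # row-major linear index
    proc, frame = page_map[vpage]
    return proc, frame * PAGE_SIZE + offset


page_map = build_page_map(4, 8, skewed_owner)
print(translate(1, 5, 8, page_map))  # (0, 5)
```

Because the tile shape is captured entirely in the page map, the runtime address computation stays the same however complicated the partition is; only the table contents change.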
Similar resources
Development of a Parallel DBMS on the Basis of PostgreSQL
The paper describes the architecture and the design of PargreSQL parallel database management system (DBMS) for distributed memory multiprocessors. PargreSQL is based upon PostgreSQL open-source DBMS and exploits partitioned parallelism.
System Software Support for Reducing Memory Latency on Distributed Shared Memory Multiprocessors
This paper overviews results from our recent work on building customized system software support for Distributed Shared Memory Multiprocessors. The mechanisms and policies outlined in this paper are connected with a single conceptual thread: they all attempt to reduce the memory latency of parallel programs by optimizing critical system services, while hiding the complex architectural details o...
Effective Instruction Prefetching In Chip Multiprocessors
threaded application performance, often achieved through instruction level parallelism per chip is increasing, the software and hardware techniques to exploit the potential of studies mostly involve distributed shared memory multiprocessors and fetching will not be fully effective at masking the remote fetch latency. the effective address of the load instructions along that path based upon a hi...
Virtual Clusters: Resource Management on Large Shared-memory Multiprocessors a Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Despite the fact that large scale shared-memory multiprocessors have been commercially available for several years, system software that fully utilizes all of their features is still not available. These machines require system software that is scalable, supports fault containment, and provides scalable resource management. Software supporting these features is currently unavailable, mostly due...
Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory
Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blocks for software distributed shared memory systems. Two distinct approaches have been used: the fine-grain approach that instruments application loads and stores to support a small coherence granularity, and the coarse-grain approach based on virtual memory hardware that provides coherence at a p...